After exploring the general pattern of modelled versus observed GPP, the next step is to identify the specific period in which modelled and observed GPP disagree at each site. That is the focus of this markdown file.
step1: tidy the table of sites used for comparing simulated and observed GPP
step2: find a way to separate out the early-season period of the model simulation
library(kableExtra)
library("readxl")
table.path<-"D:/data/photocold_project/sel_sites_info/Using_sites_in_Fluxnet2015/"
load(file=paste0(table.path,"df_sites_avai.RDA"))
my_data<-df_sites_avai
my_data %>%
  kbl(caption = "Summary of sites with GPP estimation") %>%
  kable_classic(full_width = FALSE, html_font = "Cambria")
| sitename | lon | lat | elv | year_start | year_end | classid | c4 | whc | koeppen_code | igbp_land_use | plant_functional_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BE-Bra | 4.5206 | 51.3092 | 16 | 1996 | 2014 | MF | FALSE | 85.68380 | Cfb | Mixed Forests | Deciduous Broadleaf Trees |
| BE-Vie | 5.9981 | 50.3051 | 493 | 1996 | 2014 | MF | FALSE | 312.76520 | Cfb | Mixed Forests | Deciduous Broadleaf Trees |
| CA-Man | -98.4808 | 55.8796 | 259 | 1994 | 2008 | ENF | FALSE | 52.67784 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| CA-NS1 | -98.4839 | 55.8792 | 260 | 2001 | 2005 | ENF | FALSE | 50.25988 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| CA-NS2 | -98.5247 | 55.9058 | 260 | 2001 | 2005 | ENF | FALSE | 59.02733 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| CA-NS3 | -98.3822 | 55.9117 | 260 | 2001 | 2005 | ENF | FALSE | 115.96288 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| CA-NS4 | -98.3822 | 55.9117 | 260 | 2002 | 2005 | ENF | FALSE | 115.96288 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| CA-NS5 | -98.4850 | 55.8631 | 260 | 2001 | 2005 | ENF | FALSE | 32.74040 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| CA-Qfo | -74.3421 | 49.6925 | 382 | 2003 | 2010 | ENF | FALSE | 176.82556 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| CA-SF1 | -105.8176 | 54.4850 | 536 | 2003 | 2006 | ENF | FALSE | 265.99557 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| CA-SF2 | -105.8775 | 54.2539 | 520 | 2001 | 2005 | ENF | FALSE | 286.65930 | Dfc | Mixed Forests | Evergreen Needleleaf Trees |
| CH-Lae | 8.3650 | 47.4781 | 689 | 2004 | 2014 | MF | FALSE | 292.45551 | Cfb | Mixed Forests | Deciduous Broadleaf Trees |
| CN-Qia | 115.0581 | 26.7414 | 64 | 2003 | 2005 | ENF | FALSE | 303.67596 | Cfa | Woody Savannas | Shrub |
| CZ-BK1 | 18.5369 | 49.5021 | 875 | 2004 | 2008 | ENF | FALSE | 260.95676 | Dfb | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| DE-Hai | 10.4530 | 51.0792 | 430 | 2000 | 2012 | DBF | FALSE | 282.66736 | Cfb | Mixed Forests | Deciduous Broadleaf Trees |
| DE-Lkb | 13.3047 | 49.0996 | 1308 | 2009 | 2013 | ENF | FALSE | 189.99904 | Cfb | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| DE-Obe | 13.7196 | 50.7836 | 735 | 2008 | 2014 | ENF | FALSE | 246.86536 | Cfb | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| DE-Tha | 13.5669 | 50.9636 | 380 | 1996 | 2014 | ENF | FALSE | 295.66315 | Cfb | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| DK-Sor | 11.6446 | 55.4859 | 40 | 1996 | 2014 | DBF | FALSE | 226.43781 | Cfb | Deciduous Broadleaf Forest | Deciduous Broadleaf Trees |
| FI-Hyy | 24.2950 | 61.8475 | 181 | 1996 | 2014 | ENF | FALSE | 255.05896 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| FR-Fon | 2.7801 | 48.4764 | 103 | 2005 | 2014 | DBF | FALSE | 335.19290 | Cfb | Deciduous Broadleaf Forest | Deciduous Broadleaf Trees |
| FR-LBr | -0.7693 | 44.7171 | 61 | 1996 | 2008 | ENF | FALSE | 269.57657 | Cfb | Cropland/Natural Vegetation Mosaic | Shrub |
| IT-Col | 13.5881 | 41.8494 | 1560 | 1996 | 2014 | DBF | FALSE | 267.97675 | Cfa | Deciduous Broadleaf Forest | Deciduous Broadleaf Trees |
| IT-Isp | 8.6336 | 45.8126 | 210 | 2013 | 2014 | DBF | FALSE | 320.68103 | Cfb | Woody Savannas | Deciduous Broadleaf Trees |
| IT-La2 | 11.2853 | 45.9542 | 1350 | 2000 | 2002 | ENF | FALSE | 237.59509 | Cfb | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| IT-Lav | 11.2813 | 45.9562 | 1353 | 2003 | 2014 | ENF | FALSE | 249.79709 | Cfb | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| IT-PT1 | 9.0610 | 45.2009 | 60 | 2002 | 2004 | DBF | FALSE | 317.98535 | Cfa | Croplands | Cereal crop |
| IT-Ren | 11.4337 | 46.5869 | 1730 | 1998 | 2013 | ENF | FALSE | 167.45172 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| JP-MBF | 142.3186 | 44.3869 | 545 | 2003 | 2005 | DBF | FALSE | 214.18483 | Dfb | Mixed Forests | Deciduous Broadleaf Trees |
| JP-SMF | 137.0788 | 35.2617 | 175 | 2002 | 2006 | MF | FALSE | 294.94739 | Cfa | Croplands | Cereal crop |
| NL-Loo | 5.7436 | 52.1666 | 25 | 1996 | 2013 | ENF | FALSE | 71.05942 | Cfb | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| RU-Fyo | 32.9221 | 56.4615 | 265 | 1998 | 2014 | ENF | FALSE | 301.45709 | Dfb | Mixed Forests | Evergreen Needleleaf Trees |
| US-GBT | -106.2397 | 41.3658 | 3191 | 1999 | 2006 | ENF | FALSE | 219.37785 | Dfc | Evergreen Needleleaf Forests | (null) |
| US-GLE | -106.2399 | 41.3665 | 3197 | 2004 | 2014 | ENF | FALSE | 207.54053 | Dfb | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| US-Ha1 | -72.1715 | 42.5378 | 340 | 1991 | 2012 | DBF | FALSE | 193.85033 | Dfb | Mixed Forests | Deciduous Broadleaf Trees |
| US-MMS | -86.4131 | 39.3232 | 275 | 1999 | 2014 | DBF | FALSE | 343.01581 | Cfa | Deciduous Broadleaf Forest | Deciduous Broadleaf Trees |
| US-NR1 | -105.5464 | 40.0329 | 3050 | 1998 | 2014 | ENF | FALSE | 170.75986 | Dfc | Evergreen Needleleaf Forest | Evergreen Needleleaf Trees |
| US-PFa | -90.2723 | 45.9459 | 470 | 1995 | 2014 | MF | FALSE | 203.73933 | Dfb | Mixed Forests | Deciduous Broadleaf Trees |
| US-Prr | -147.4876 | 65.1237 | 210 | 2010 | 2013 | ENF | FALSE | 382.20374 | Dfc | Evergreen Needleleaf Forests | Evergreen Needleleaf Trees |
| US-Syv | -89.3477 | 46.2420 | 540 | 2001 | 2014 | MF | FALSE | 222.69208 | Dfb | Mixed Forests | Deciduous Broadleaf Trees |
| US-UMB | -84.7138 | 45.5598 | 234 | 2000 | 2014 | DBF | FALSE | 174.07025 | Dfb | Deciduous Broadleaf Forest | Deciduous Broadleaf Trees |
| US-UMd | -84.6975 | 45.5625 | 239 | 2007 | 2014 | DBF | FALSE | 235.31183 | Dfb | Mixed Forests | Deciduous Broadleaf Trees |
| US-WCr | -90.0799 | 45.8059 | 520 | 1999 | 2014 | DBF | FALSE | 264.96152 | Dfb | Deciduous Broadleaf Forest | Deciduous Broadleaf Trees |
| US-Wi0 | -91.0814 | 46.6188 | 349 | 2002 | 2002 | ENF | FALSE | 325.71179 | Dfb | Mixed Forests | Evergreen Needleleaf Trees |
| US-Wi3 | -91.0987 | 46.6347 | 411 | 2002 | 2004 | DBF | FALSE | 343.67532 | Dfb | Deciduous Broadleaf Forest | Deciduous Broadleaf Trees |
| US-Wi4 | -91.1663 | 46.7393 | 352 | 2002 | 2005 | ENF | FALSE | 299.29538 | Dfb | Mixed Forests | Evergreen Needleleaf Trees |
| US-Wi9 | -91.0814 | 46.6188 | 350 | 2004 | 2005 | ENF | FALSE | 325.71179 | Dfb | Mixed Forests | Evergreen Needleleaf Trees |
(1) For Cfa: DBF, MF, and ENF sites (5 sites; all 5 used in the end)
my_data_Cfa<-my_data[my_data$koeppen_code=="Cfa",]
## [1] 9
## [1] 2
## [1] 15
- Cfa-MF (1 site)
## [1] 4
- Cfa-ENF (1 site)
## [1] 3
(2) For Cfb: DBF, MF, and ENF sites (14 sites; 12 used in the end)
my_data_Cfb<-my_data[my_data$koeppen_code=="Cfb",]
## [1] 13
## [1] 13
## [1] 8
## [1] 2
- Cfb-MF (3 sites; 2 used in the end)
## [1] 15
## [1] 10
- Cfb-ENF (7 sites; 6 used in the end)
## [1] 4
## [1] 7
## [1] 15
## [1] 6
## [1] 12
## [1] 14
(3) For Cfc: 0 sites
my_data_Cfc<-my_data[my_data$koeppen_code=="Cfc",]
(4) For Dfa: 0 sites
my_data_Dfa<-my_data[my_data$koeppen_code=="Dfa",]
(5) For Dfb: DBF, MF, and ENF sites (14 sites; 10 used in the end)
my_data_Dfb<-my_data[my_data$koeppen_code=="Dfb",]
## [1] 2
## [1] 15
## [1] 7
## [1] 13
## [1] 11
- Dfb-MF (2 sites; both used)
## [1] 14
## [1] 6
- Dfb-ENF (6 sites; 3 used in the end)
## [1] 5
## [1] 14
## [1] 9
(6) For Dfc: ENF sites (14 sites; 12 used in the end)
my_data_Dfc<-my_data[my_data$koeppen_code=="Dfc",]
## [1] 6
## [1] 3
## [1] 3
## [1] 4
## [1] 2
## [1] 4
## [1] 7
## [1] 15
## [1] 11
## [1] 2
## [1] 15
## [1] 2
(7) For Dfd: ENF sites (0 sites)
my_data_Dfd<-my_data[my_data$koeppen_code=="Dfd",]
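The per-climate subsets above can also be summarized in a single cross-tabulation of Köppen code against vegetation class. The sketch below is only an illustration: `sites` is a hypothetical stand-in holding a few rows copied from the site table, and `koeppen_levels` and `site_counts` are names introduced here, not objects from the original script.

```r
# Hypothetical stand-in for a few rows of df_sites_avai (values copied from the table above)
sites <- data.frame(
  sitename     = c("IT-Col", "JP-SMF", "CN-Qia", "BE-Bra", "DE-Hai", "FI-Hyy"),
  classid      = c("DBF", "MF", "ENF", "MF", "DBF", "ENF"),
  koeppen_code = c("Cfa", "Cfa", "Cfa", "Cfb", "Cfb", "Dfc"),
  stringsAsFactors = FALSE
)

# Fix the factor levels so that empty classes (e.g. Cfc, Dfa, Dfd) still appear with 0 sites
koeppen_levels <- c("Cfa", "Cfb", "Cfc", "Dfa", "Dfb", "Dfc", "Dfd")
site_counts <- table(
  koeppen = factor(sites$koeppen_code, levels = koeppen_levels),
  classid = factor(sites$classid, levels = c("DBF", "MF", "ENF"))
)
print(site_counts)

# split() yields one data frame per climate class, analogous to my_data_Cfa, my_data_Cfb, ...
by_climate <- split(sites, factor(sites$koeppen_code, levels = koeppen_levels))
```

With the full `my_data` table in place of `sites`, the same two calls would give the site counts per class without one subsetting line per Köppen code.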
## updates: 11-19
## step3: save the data labelled with "is_event"
**Step 1: normalization over all years at each site**
- gpp_obs and gpp_mod are normalized by gpp_max (the 95th percentile of GPP)
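A minimal sketch of this normalization follows. It assumes that each series is divided by its own 95th percentile, pooled over all years at one site; the notes do not say whether a common gpp_max is used for both series instead. The helper `normalize_gpp` and the synthetic data are hypothetical.

```r
# Hypothetical helper: scale a GPP series by its 95th percentile ("gpp_max")
normalize_gpp <- function(gpp, prob = 0.95) {
  gpp_max <- as.numeric(quantile(gpp, probs = prob, na.rm = TRUE))
  gpp / gpp_max
}

# Synthetic data for one site, three years pooled (values are made up)
set.seed(1)
doy <- rep(1:365, times = 3)
df <- data.frame(
  doy     = doy,
  gpp_obs = pmax(0, 8 * exp(-((doy - 200) / 60)^2) + rnorm(length(doy), sd = 0.5)),
  gpp_mod = pmax(0, 9 * exp(-((doy - 190) / 70)^2))
)

# After scaling, the 95th percentile of each series equals 1
df$gpp_obs_norm <- normalize_gpp(df$gpp_obs)
df$gpp_mod_norm <- normalize_gpp(df$gpp_mod)
```

Scaling by a high percentile rather than the maximum keeps a single outlier from setting the scale of the whole series.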
**Step 2: determine the green-up period for each year (using spline-smoothed values)**
- The following analysis is based on the normalized GPP_mod time series (which gives an earlier sos).
- The normalized GPP_mod is used to determine the sos, eos, and peak of the time series; in this study, sos and eos are set by a threshold at the 10th percentile of the amplitude. GPP_mod was chosen for determining the phenophases because it generally yields an earlier sos than GPP_obs, which gives a longer analysis period.
- Update on Aug 31, 2011: sos is constrained to be later than February (DoY 60) in order to remove some unrealistic sos values.
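The phenophase extraction described above might look roughly like the sketch below. It rests on several assumptions that are not confirmed by the notes: the smoothing is done with `smooth.spline()`, the threshold is read as the seasonal minimum plus 10% of the seasonal amplitude, and sos is taken as the first day at or after DoY 60 on which the smoothed curve reaches the threshold. `extract_phenophases` and its arguments are hypothetical names.

```r
# Hypothetical helper: sos, peak and eos (as DoY) from one year of normalized GPP
extract_phenophases <- function(doy, gpp_norm, thresh_frac = 0.10, min_sos = 60) {
  ok  <- !is.na(gpp_norm)
  fit <- smooth.spline(doy[ok], gpp_norm[ok])   # spline smoothing (assumed method)
  sm  <- predict(fit, doy)$y

  base      <- min(sm)
  amplitude <- max(sm) - base
  thresh    <- base + thresh_frac * amplitude   # assumed reading of the 10% threshold

  i_peak <- which.max(sm)
  above  <- sm >= thresh
  idx    <- seq_along(doy)

  # sos: first day at/after min_sos and before the peak with smoothed GPP above the threshold
  i_sos <- which(above & doy >= min_sos & idx <= i_peak)[1]
  # eos: last day at/after the peak with smoothed GPP above the threshold
  i_eos <- max(which(above & idx >= i_peak))

  c(sos = doy[i_sos], peak = doy[i_peak], eos = doy[i_eos])
}

# Synthetic single-year example (values are made up)
set.seed(42)
doy <- 1:365
gpp_mod_norm <- exp(-((doy - 200) / 50)^2) + rnorm(365, sd = 0.02)
pheno <- extract_phenophases(doy, gpp_mod_norm)
print(pheno)
```

The green-up period used in the later steps is then the interval from `pheno["sos"]` to `pheno["peak"]`.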
**Step 3: rolling mean of GPP_obs and GPP_mod over all years (moving windows of 5, 7, 10, 15, and 20 days)**
**This is also applied to the data outside the green-up period; the code for this step has been moved to the second step.**
- In the end, I selected the 20-day window for the rolling mean.
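As a rough illustration of the window comparison, the sketch below computes a centred rolling mean with base R's `stats::filter()`. The original analysis may well have used a different function or alignment; `rolling_mean` is a hypothetical helper and the step-shaped series is synthetic.

```r
# Hypothetical helper: centred rolling mean using only base R
rolling_mean <- function(x, window = 20) {
  # Equal weights; the edges, where the window is incomplete, are returned as NA
  as.numeric(stats::filter(x, rep(1 / window, window), sides = 2))
}

# Synthetic step-shaped series to show how the window length changes the smoothing
x <- c(rep(0, 30), rep(1, 30))
windows <- c(5, 7, 10, 15, 20)
smoothed <- sapply(windows, function(w) rolling_mean(x, w))
colnames(smoothed) <- paste0("win_", windows)
```

Longer windows suppress more day-to-day noise but also blur sharp transitions and lose more points at the start and end of the record, which is the trade-off behind picking a single window.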
**Step 4: fit a Gaussian (normal) distribution to the residuals outside the green-up period**
- The reasoning: we assume that, in general, the P-model estimates GPP well outside the green-up period (compared with the observations).
- In practice, however, model performance is not always good outside the green-up period, so I tested three data ranges:
a. DoY [peak, 365/366]
b. DoY [1, sos] & DoY [peak, 365/366]
c. DoY [1, sos] & DoY [eos, 365/366]
I found that with data range c the distribution of the bias (GPP_mod - GPP_obs) is closest to a normal distribution, so in the end I used data range c to build the distribution.
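A minimal sketch of building the reference distribution from data range c is given below. It assumes that "fitting" the normal distribution means estimating its mean and standard deviation from the bias values; `fit_bias_distribution` is a hypothetical helper and the input data are synthetic.

```r
# Hypothetical helper: mean and SD of the bias (GPP_mod - GPP_obs) in data range c
fit_bias_distribution <- function(doy, gpp_mod, gpp_obs, sos, eos) {
  outside <- doy <= sos | doy >= eos      # data range c: [1, sos] & [eos, 365/366]
  bias <- (gpp_mod - gpp_obs)[outside]
  bias <- bias[!is.na(bias)]
  list(mean = mean(bias), sd = sd(bias), n = length(bias))
}

# Synthetic single-year example (values are made up)
set.seed(7)
doy <- 1:365
gpp_obs <- rep(0.1, 365)
gpp_mod <- gpp_obs + rnorm(365, mean = 0.02, sd = 0.05)
fit <- fit_bias_distribution(doy, gpp_mod, gpp_obs, sos = 100, eos = 300)
str(fit)
```

To compare the candidate ranges a, b, and c, the same bias vector could be inspected with `qqnorm()` for each range.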
**Step 5: determine the "is_event" within the green-up period**
- After some consideration, I adopted the following criteria to determine the "is_event":
1) During the green-up period (sos, peak), data with GPP biases larger than 3 SD are classified as "GPP overestimation points".
2) Among the "GPP overestimation points", only data points in the first 2/3 of the green-up period are regarded as "is_event".
3) Among the "is_event" points, those with air temperature below 10 degrees are classified as "is_event_less10". I selected 10 degrees as the criterion by referring to Duffy et al., 2021 and to many papers showing that the temperature response curve normally starts from 10 degrees (for instance, Lin et al., 2012).
References:
Duffy et al., 2021: https://advances.sciencemag.org/content/7/3/eaay1052
Lin et al., 2012: https://academic.oup.com/treephys/article/32/2/219/1657108
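The three criteria of step 5 can be sketched as a single flagging function. Two points are assumptions made here: "larger than 3 SD" is read as a bias above the fitted mean plus 3 SD, and the first 2/3 of the green-up period is measured in days from sos to peak. `flag_events` and its argument names are hypothetical.

```r
# Hypothetical helper: flag overestimation points, "is_event" and "is_event_less10"
flag_events <- function(doy, bias, temp, sos, peak, bias_mean, bias_sd,
                        n_sd = 3, temp_max = 10) {
  in_greenup <- doy > sos & doy < peak
  # 1) bias above mean + 3 SD of the distribution fitted outside the green-up period
  over <- in_greenup & !is.na(bias) & bias > bias_mean + n_sd * bias_sd
  # 2) keep only points in the first 2/3 of the green-up period
  early_limit <- sos + (2 / 3) * (peak - sos)
  is_event <- over & doy <= early_limit
  # 3) events that occur at air temperatures below 10 degrees
  is_event_less10 <- is_event & !is.na(temp) & temp < temp_max
  data.frame(doy, over_estimation = over, is_event, is_event_less10)
}

# Synthetic example (values are made up): green-up from DoY 100 to 190
doy  <- 100:200
bias <- ifelse(doy >= 110 & doy <= 170, 0.5, 0)
temp <- ifelse(doy <= 130, 5, 15)
flags <- flag_events(doy, bias, temp, sos = 100, peak = 190,
                     bias_mean = 0, bias_sd = 0.1)
colSums(flags[, c("over_estimation", "is_event", "is_event_less10")])
```

In this example the 2/3 cutoff falls at DoY 160, so overestimation points between DoY 161 and 170 are not counted as events.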
**Step 6: evaluate the "is_event" flags (visualization and stats)**
- Two ways to evaluate whether "is_event" is properly determined:
1) visualization
2) stats:
$$
P_{false} = \frac{days(real_{(is\text{-}event)})}{days(flagged_{(is\text{-}event)})}
$$
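The statistic can be computed directly from two logical vectors, as in the sketch below. It implements the ratio exactly as written in the formula. How the "real" events are identified is not specified in these notes, so `real_event` is a hypothetical input (for example, the result of a visual check), and `p_false` is a name introduced here.

```r
# Hypothetical helper: ratio of real "is_event" days to flagged "is_event" days
p_false <- function(real_event, flagged_event) {
  sum(real_event, na.rm = TRUE) / sum(flagged_event, na.rm = TRUE)
}

# Made-up example: 4 flagged days, 2 of which were judged to be real events
flagged_event <- c(TRUE, TRUE, TRUE, TRUE, FALSE)
real_event    <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
p_false(real_event, flagged_event)
```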